(54) Title: METHOD AND SYSTEM FOR CREDIT-BASED DATA FLOW CONTROL
(57) Abstract 

Methods and systems for controlling data 
flow between a sender and a receiver include 
communicating credit lists to the sender. The 
credit lists include credits indicative of receive 
buffer sizes accessible by the receiver and capable
of receiving data. The sender transmits
data packets to the receiver. The data packets
are preferably no greater in size than the credits
specified in the credit list. When the sender
uses all of the credits, the sender preferably refrains
from sending data packets to the receiver
until the supply of credits is replenished by the
receiver. Because data flow between the sender 
and the receiver is regulated using credits, the 
likelihood of data overflow errors is reduced and 
communication efficiency is increased. 
METHOD AND SYSTEM FOR CREDIT-BASED DATA FLOW CONTROL

This application claims the benefit of U.S. Provisional Patent Application No. 
60/095,297 filed August 4, 1998, the disclosure of which is incorporated herein by 
reference in its entirety. 

TECHNICAL FIELD 

The present invention relates to methods and systems for controlling data flow 
between sending and receiving processes executing on one or more computers. More 
particularly, the present invention relates to methods and systems for controlling data 
flow between a sender and a receiver, each including one or more computer processes, by 
communicating credits from the receiver to the sender indicating receive buffer sizes with
reduced copying of data between sending and receiving applications. 

BACKGROUND OF THE INVENTION

In computer communication systems, it is desirable to control the flow of data 
from a sending process to a receiving process. For example, if a sending process sends
data to a receiving process faster than the receiving process can receive and process the
data, data may be lost or overwritten. Similarly, if a sending process sends data and the
receiving process fails to provide a buffer to receive the data, the connection between the
sending and receiving processes may be broken.

In conventional flow control techniques, such as TCP flow control techniques,

Another problem with conventional TCP flow control methods is that the TCP
buffer size information communicated by a TCP receiver may not reflect the actual 
available TCP buffer size. For example, conventional TCP protocol software may 
advertise to the sender an upper limit on the number of bytes that a TCP buffer is capable 
of receiving. This upper limit may not reflect the actual memory space reserved for the
TCP buffer when data arrives from the sender. Thus, conventional flow control methods 
may not communicate accurate buffer size information to the sender. 

Yet another problem with TCP flow control methods is that the copying of data 
between the TCP buffers and the sending and receiving application buffers introduces
latency into data transfers. As a result of this latency, these methods may not be feasible
in high-speed environments, such as system area networks (SANs). For example, in 
TCP, data may be copied from a sender's application-level buffer to the sender's TCP 
buffer and from a receiver's TCP buffer to the receiver's application-level buffer. This 
copying may have a significant impact on I/O performance in high-speed environments. 

In order to increase I/O performance over conventional communications
protocols, some communication protocols, such as the Virtual Interface Architecture
(VIA), do not buffer data for an application or perform fragmentation and reassembly of 
data. Data is sent from a sending I/O device, over a network, and received directly into 
an application-level receive buffer of a receiver. If a sender utilizing the VIA architecture
attempts to send data when a receive buffer is not available, the connection between the
sender and receiver is broken. The breaking of a connection is a catastrophic,
unrecoverable error that requires reestablishment of the connection and resending of the

data from a send buffer associated with a sender to a receive buffer associated with a
receiver. In a preferred implementation of the invention, the only copy of the data made
between the send buffer and the receive buffer may be the signal transmitted over the
communication link between the sender and the receiver. Copying of data increases the time
required to process an I/O request. Thus, reducing the number of copies between the send
buffer and the receive buffer increases transmission efficiency. 

In order to control the flow of data without copying the data, the receiver may 
communicate application-level receive buffer sizes to the sender. The receiver preferably 
communicates the buffer size information to the sender in an efficient manner. For
example, the more buffer size information communicated to the sender in each flow
control communication, the more efficient the communication process. In one
implementation, the receiver may communicate a list containing at least one
application-level receive buffer size to the sender, so that the sender can determine how much data
the receiver is capable of receiving. In preferred implementations of the invention, the
receiver may send a list containing a plurality of application-level receive buffer sizes to
the sender. One method for communicating the list of buffer sizes to the sender is by
sending a message, e.g., a packet, from the receiver to the sender over a data channel
established between the sender and the receiver. The message may contain the list of
receive buffer sizes, and is hereinafter referred to as a credit message. The receive buffer
sizes in the credit message are hereinafter referred to as credits.
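
As a rough illustration, a credit message can be modeled as an ordered list of receive buffer sizes. The sketch below is an assumption made for illustration only: the name `CreditMessage`, the 16-bit count field, and the 32-bit big-endian size fields are hypothetical and are not specified by this description.

```python
import struct
from dataclasses import dataclass, field
from typing import List


@dataclass
class CreditMessage:
    """Hypothetical credit message: an ordered list of receive buffer sizes."""
    credits: List[int] = field(default_factory=list)  # one credit = one buffer size in bytes

    def pack(self) -> bytes:
        # Assumed wire layout: 16-bit credit count, then one 32-bit size per credit.
        return struct.pack(f"!H{len(self.credits)}I", len(self.credits), *self.credits)

    @classmethod
    def unpack(cls, payload: bytes) -> "CreditMessage":
        (count,) = struct.unpack_from("!H", payload, 0)
        sizes = struct.unpack_from(f"!{count}I", payload, 2)
        return cls(list(sizes))
```

Keeping the credits in a list preserves the order in which the receiver made its buffers available, which matters because the sender is expected to use the credits in that order.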

The sender may utilize the credits in the credit message to determine the size and 
order of data packets to be sent to the receiver. For example, the sender preferably does 

sender may use the credits in the manner previously described to determine how to
partition and send data to the receiver. 

In implementations of the invention where credit messages are used to deliver 
credits to the sender, the credit messages may be delivered using a new protocol or by 
extending an existing protocol. For example, in a new protocol, the sender and the
receiver may exchange credit messages over a control channel established exclusively for
the exchange of credit messages. In order to extend an existing protocol, credits may be
communicated to the sender using optional data fields in the existing protocol. For
example, in TCP, credits may be communicated to the sender using the OPTIONS field
in any TCP packet, such as a TCP acknowledgment packet. The TCP sender may then
send data to the receiver having lengths corresponding to the credits.
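
One way to picture the OPTIONS-field extension is as a kind/length/value option that carries buffer sizes. The sketch below only builds and parses the option bytes and does not touch a real TCP stack. The option kind 253 (a value reserved for experiments), the 32-bit credit width, and the function names are assumptions for illustration and are not part of this description.

```python
import struct
from typing import List

EXPERIMENTAL_KIND = 253   # assumed: a TCP option kind reserved for experiments
MAX_OPTION_SPACE = 40     # the TCP options area holds at most 40 bytes


def encode_credit_option(credits: List[int]) -> bytes:
    """Build kind | length | 32-bit credits; length counts the two header bytes."""
    length = 2 + 4 * len(credits)
    if length > MAX_OPTION_SPACE:
        raise ValueError("too many credits for one TCP options area")
    return struct.pack(f"!BB{len(credits)}I", EXPERIMENTAL_KIND, length, *credits)


def decode_credit_option(option: bytes) -> List[int]:
    """Recover the ordered credit list from an option built by encode_credit_option."""
    kind, length = struct.unpack_from("!BB", option, 0)
    if kind != EXPERIMENTAL_KIND or length != len(option):
        raise ValueError("not a credit option")
    return list(struct.unpack_from(f"!{(length - 2) // 4}I", option, 2))
```

Under these assumptions at most nine 32-bit credits fit in a single options area, so a receiver with many buffers would have to spread its credits over several packets.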

According to another aspect, the present invention may include methods and 
systems for determining when to communicate credits to a sender. The receiver 
preferably communicates credits to the sender in a timely manner. For example, if the
sender has data to be sent and the receiver fails to timely notify the sender of the available
receive buffer space, sending may be delayed. In order to avoid delays in sending, the
receiver may monitor credits sent to the sender, the rate at which the sender uses the
credits, and/or when the sender uses particular credits in a credit list previously
communicated to the sender. Based on the monitored information, the receiver may
determine when to communicate new credits to the sender to avoid the condition where
the sender has data to send but has no credits. For example, the receiver may
communicate new credits to the sender after receiving data from the sender into a first

sender and receiver, whichever is smaller. Thus, based on these rules, the present
invention reliably implements flow control of credits. 

According to another aspect, the present invention includes a method for 
controlling data flow between a sender and a receiver. The method includes 
communicating a first credit list to a sender. The first credit list may include a plurality
of credits indicative of buffer sizes of receive buffers accessible by the receiver and
capable of receiving data from the sender. In response to receiving the first credit list, the
sender transmits a data packet to the receiver. The data packet is no greater in size than a
first buffer size specified by a first credit in the first credit list.
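
The sender-side rule can be sketched as a function that splits outgoing data into packets, each no larger than the next unused credit. This is a minimal sketch under stated assumptions: the function name `partition_by_credits` and the use of a `deque` for the credit list are hypothetical choices, not the claimed implementation.

```python
from collections import deque
from typing import Deque, List


def partition_by_credits(data: bytes, credits: Deque[int]) -> List[bytes]:
    """Split data into packets, consuming credits in the order they were received.

    Each packet is no larger than the credit it consumes. Partitioning stops
    when the credits run out; any unsent bytes remain with the caller.
    """
    packets: List[bytes] = []
    offset = 0
    while offset < len(data) and credits:
        size = credits.popleft()                  # use the oldest credit first
        packets.append(data[offset:offset + size])
        offset += size
    return packets
```

For instance, six bytes of data sent against credits of four bytes and two bytes would yield one four-byte packet followed by one two-byte packet, after which the sender holds no credits and should wait for a new credit list.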

According to another aspect, the present invention includes a credit list
builder/communicator including computer-executable instructions embodied in a
computer-readable medium for performing steps. The steps may include receiving 
requests for receiving data into a plurality of receive buffers accessible by a receiver and 
capable of receiving data from a sender. In response to the requests, the credit list 
builder/communicator may build a credit list including a plurality of credits indicative of
sizes of a plurality of receive buffers. After building a credit list, the credit list 
builder/communicator may communicate the credit list to the sender. 
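
The receiver-side steps above can be sketched as follows. The class name `CreditListBuilder`, the `communicate` helper, and the `send` callable standing in for the I/O device are all hypothetical names introduced for illustration.

```python
from typing import Callable, List


class CreditListBuilder:
    """Hypothetical receiver-side builder: one credit per posted receive buffer."""

    def __init__(self) -> None:
        self._pending: List[int] = []   # sizes of buffers not yet advertised

    def on_receive_request(self, buffer_size: int) -> None:
        # A receive request makes a buffer of this size available for incoming data.
        if buffer_size <= 0:
            raise ValueError("buffer size must be positive")
        self._pending.append(buffer_size)

    def build_credit_list(self) -> List[int]:
        # Credits keep the order in which the buffers were posted.
        credit_list, self._pending = self._pending, []
        return credit_list


def communicate(credit_list: List[int], send: Callable[[List[int]], None]) -> None:
    # `send` is an assumed callable standing in for the I/O device.
    if credit_list:
        send(list(credit_list))
```

Batching several receive requests into one credit list reflects the earlier observation that carrying more buffer size information per flow control communication is more efficient.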

According to another aspect, the present invention may include a data structure 
for controlling data flow between a sender and a receiver. The data structure may include 
a credit list including a plurality of credits. Each credit in the credit list is indicative of a
buffer size of a receive buffer accessible by a receiver and capable of receiving data from 
a sender. 

determine when to communicate a second credit list to the sender based on the
frequency. For example, the credit list builder/communicator may determine a triggering
buffer corresponding to a credit in the first credit list based on the frequency. The credit
list builder/communicator may instruct an input/output device to send the second credit
message to the sender when the triggering buffer receives data. In an alternative
arrangement, rather than determining a triggering buffer, the credit list
builder/communicator may determine a time in time units, such as milliseconds, for
determining when to send a new credit message to the sender, based on the frequency.

According to another aspect of the invention, the receiver may utilize credits to
implement quality of service features. For example, the receiver may be a server that
provides services to a plurality of client senders. Since the server may concurrently
receive data from multiple clients, it may be desirable for the server to impose a
maximum allowable bandwidth restriction on each client, to prevent the server from
being overrun with data. One way that the server may control the bandwidth is by
regulating the number of unused credits available to each client so that no client has
enough credits to exceed the maximum allowable bandwidth. By using available credits
to regulate maximum bandwidth for each client, the server maintains a given quality of
service for all clients.
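
A minimal sketch of this idea caps the number of unused credit bytes each client may hold at any time. The class name `PerClientCreditLimiter` and the byte-based cap are assumptions; a cap on outstanding bytes bounds bandwidth only in combination with the rate at which the server replenishes credits, which this sketch does not model.

```python
from typing import Dict, List


class PerClientCreditLimiter:
    """Hypothetical server-side cap on unused credit bytes per client."""

    def __init__(self, max_outstanding_bytes: int) -> None:
        self.max_outstanding_bytes = max_outstanding_bytes
        self._outstanding: Dict[str, int] = {}   # client id -> unused credit bytes

    def grant(self, client: str, buffer_sizes: List[int]) -> List[int]:
        """Return the prefix of buffer_sizes that fits under the client's cap."""
        used = self._outstanding.get(client, 0)
        granted: List[int] = []
        for size in buffer_sizes:
            if used + size > self.max_outstanding_bytes:
                break                    # keep order: stop at the first credit that does not fit
            granted.append(size)
            used += size
        self._outstanding[client] = used
        return granted

    def on_data_received(self, client: str, credit_size: int) -> None:
        # The client consumed one credit, so that much headroom is available again.
        remaining = self._outstanding.get(client, 0) - credit_size
        self._outstanding[client] = max(0, remaining)
```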

According to another aspect, the present invention may include a credit list 
builder/communicator including computer-executable instructions embodied in a
computer-readable medium for performing steps. The steps may include operating in a
first mode for determining when to communicate new credits to a sender. The credit list

coprocessor. Alternatively, the memory device may comprise a memory chip external
to the chip containing the processing circuit. The memory device may comprise a

[...]ons. Alternatively, the memory device may comprise an application specific

According to another aspect, the present invention may include an input/output
device. The input/output device may include a processing circuit and a memory device,
as previously described. The computer-executable instructions included in or
implemented by the memory device may perform steps. The steps may include posting a
first buffer accessible by a sender for receiving credits from a receiver. The next step 
may include determining whether credits have been received in the first buffer. In 
response to receiving credits in the first buffer, the next step may include posting a 
second buffer accessible by the sender for receiving additional credits from the receiver.
After posting the second buffer, the next step may include storing credits from the first
buffer in a credit list.
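
The posting sequence above can be sketched as a small state machine. The class name `CreditReceiver` and the `deliver` method, which stands in for the I/O device writing a credit message into a posted buffer, are hypothetical and only simulate the behavior in a single process.

```python
from collections import deque
from typing import Deque, Dict, List, Optional


class CreditReceiver:
    """Hypothetical sender-side logic that always keeps a buffer posted for credits."""

    def __init__(self) -> None:
        self.credit_list: Deque[int] = deque()
        self._buffers: Dict[int, Optional[List[int]]] = {}   # None while a buffer is empty
        self._next_id = 0
        self.current = self.post_buffer()                    # post the first buffer

    def post_buffer(self) -> int:
        buffer_id = self._next_id
        self._next_id += 1
        self._buffers[buffer_id] = None
        return buffer_id

    def deliver(self, buffer_id: int, credits: List[int]) -> None:
        # Stand-in for the I/O device placing a credit message in a posted buffer.
        self._buffers[buffer_id] = list(credits)

    def poll(self) -> bool:
        """If credits arrived, post the next buffer, then store the received credits."""
        if self._buffers[self.current] is None:
            return False
        filled, self.current = self.current, self.post_buffer()
        self.credit_list.extend(self._buffers.pop(filled))
        return True
```

Posting the second buffer before copying credits out of the first means a buffer is always available if another credit message arrives while the first is being processed.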

According to another aspect, the present invention may include a network 
communications system. The network communications system may include a first local
virtual interface, a second local virtual interface, and a credit list builder/communicator.
The first local virtual interface may send data to and receive data from a first remote
virtual interface over a first network connection. The second local virtual interface may
send credit messages to and receive credit messages from a second remote virtual
interface over a second network connection. The credit list builder/communicator may
build credit messages for controlling data flow over the first network connection and
communicate the credit messages to the second remote virtual interface through the
second local virtual interface and the second network connection. The credit messages
may include credit lists including a plurality of credits indicative of buffer sizes of receive
buffers for receiving data through the first local virtual interface from the first remote
virtual interface. Alternatively, each virtual interface may be used to communicate data in
one direction while communicating credit messages in the reverse direction. 

Additional features and advantages of the invention will be made apparent from
the following detailed description of illustrative embodiments which proceeds with 
reference to the accompanying figures. 

BRIEF DESCRIPTION OF THE DRAWINGS

While the appended claims set forth the features of the present invention with
particularity, the invention, together with its objects and advantages, may be best
understood from the following detailed description taken in conjunction with the
accompanying drawings of which:

Figure 1 is a block diagram generally illustrating an exemplary computer system
on which embodiments of the present invention may reside;

Figure 2 is a block diagram illustrating a sender and a receiver including a system
for controlling data flow according to an embodiment of the present invention;

Figure 3 is a more detailed block diagram of the sender and the receiver including
the system for controlling data flow according to the embodiment of Figure 2;

Figure 3(a) is a detailed block diagram of the sender and the receiver according to
an alternative embodiment of the invention;

Figure 4 is a flow chart illustrating steps that may be performed by a credit list
builder/communicator of a receiver for determining when to communicate new credits to
a sender according to an embodiment of the present invention;

Figure 5 is a flow chart illustrating exemplary steps that may be performed by a
credit list builder/communicator of a receiver for determining when to communicate new
credits to a sender according to another embodiment of the present invention;

Figure 6 is a flow chart illustrating exemplary steps that may be performed by a
credit list builder/communicator of a receiver for determining whether to switch from a
first mode to a second mode for determining when to communicate new credits to a
sender according to an embodiment of the present invention;

Figures 7(a) and 7(b) are flow charts illustrating exemplary steps that may be
performed by a credit list reader/processor of a sender for reading and processing credits
according to an embodiment of the present invention;

Figure 8 is a flow diagram illustrating an example of the transfer of credits to and
the use of credits by a sender according to an embodiment of the present invention.

SPECIFIC DESCRIPTION OF THE INVENTION 

Turning to the drawings, wherein like reference numerals refer to like elements, 
the invention is illustrated as being implemented in a suitable computing environment.
Although not required, the invention will be described in the general context of
computer-executable instructions, such as program modules, being executed by a personal
computer. Generally, program modules include routines, programs, objects, components,
data structures, etc. that perform particular tasks or implement particular abstract data
types. Moreover, those skilled in the art will appreciate that the invention may be
practiced with other computer system configurations, including hand-held devices,
multi-processor systems, microprocessor based or programmable consumer electronics,
network PCs, minicomputers, mainframe computers, and the like. The invention may
also be practiced in distributed computing environments where tasks are performed by
remote processing devices that are linked through a communications network. In a
distributed computing environment, program modules may be located in both local and
remote memory storage devices.

With reference to Fig. 1, an exemplary system for implementing the invention
includes a general purpose computing device in the form of a conventional personal
computer 20, including a processing unit 21, a system memory 22, and a system bus 23
that couples various system components including the system memory to the processing
unit 21. The system bus 23 may be any of several types of bus structures including a
memory bus or memory controller, a peripheral bus, and a local bus using any of a variety
of bus architectures. The system memory includes read only memory (ROM) 24 and
random access memory (RAM) 25. A basic input/output system (BIOS) 26, containing
the basic routines that help to transfer information between elements within the personal
computer 20, such as during start-up, is stored in ROM 24. The personal computer 20
further includes a hard disk drive 27 for reading from and writing to a hard disk, not
shown, a magnetic disk drive 28 for reading from or writing to a removable magnetic disk
29, and an optical disk drive 30 for reading from or writing to a removable optical disk 31
such as a CD ROM or other optical media.

The hard disk drive 27, magnetic disk drive 28, and optical disk drive 30 are
connected to the system bus 23 by a hard disk drive interface 32, a magnetic disk drive
interface 33, and an optical disk drive interface 34, respectively. The drives and their
associated computer-readable media provide nonvolatile storage of computer readable
instructions, data structures, program modules and other data for the personal computer
20. Although the exemplary environment described herein employs a hard disk, a 
removable magnetic disk 29, and a removable optical disk 31, it will be appreciated by 
those skilled in the art that other types of computer readable media which can store data 
that is accessible by a computer, such as magnetic cassettes, flash memory cards, digital
video disks, Bernoulli cartridges, random access memories, read only memories, and the
like may also be used in the exemplary operating environment.

A number of program modules may be stored on the hard disk, magnetic disk 29, 
optical disk 31, ROM 24 or RAM 25, including an operating system 35, one or more
application programs 36, other program modules 37, and program data 38. The
operating system 35 may include a virtual memory manager and one or more I/O device
drivers that communicate with each other to maintain coherence between virtual memory
address mapping information stored by the operating system 35 and virtual memory
mapping information stored by one or more I/O devices, such as network interface
adapters 54 and 54a. A user may enter commands and information into the personal
computer 20 through input devices such as a keyboard 40 and a pointing device 42.
Other input devices (not shown) may include a microphone, touch panel, joystick, game
pad, satellite dish, scanner, or the like. These and other input devices are often connected
to the processing unit 21 through a serial port interface 46 that is coupled to the system
bus, but may be connected by other interfaces, such as a parallel port, game port or a
universal serial bus (USB). A monitor 47 or other type of display device is also
connected to the system bus 23 via an interface, such as a video adapter 48. In addition to
the monitor, personal computers typically include other peripheral output devices, not
shown, such as speakers and printers.

The personal computer 20 may operate in a networked environment using logical
connections to one or more remote computers, such as a remote computer 49. The
remote computer 49 may be another personal computer, a server, a router, a network PC,
a peer device or other common network node, and typically includes many or all of the
elements described above relative to the personal computer 20, although only a memory
storage device 50 has been illustrated in Fig. 1. The logical connections depicted in Fig.
1 include a local area network (LAN) 51, a wide area network (WAN) 52, and a system
area network (SAN) 53. Local- and wide-area networking environments are
commonplace in offices, enterprise-wide computer networks, intranets and the Internet.

System area networking environments are used to interconnect nodes within a distributed
computing system, such as a cluster. For example, in the illustrated embodiment, the
personal computer 20 may comprise a first node in a cluster and the remote computer 49
may comprise a second node in the cluster. In such an environment, it is preferable that
the personal computer 20 and the remote computer 49 be under a common administrative
domain. Thus, although the computer 49 is labeled "remote", the computer 49 may be in
close physical proximity to the personal computer 20.

When used in a LAN or SAN networking environment, the personal computer 20
is connected to the local network 51 or system network 53 through the network interface
adapters 54 and 54a. The network interface adapters 54 and 54a may include processing
units 55 and 55a and one or more memory units 56 and 56a. The memory units 56 and
56a may contain computer-executable instructions for processing I/O requests including
translating virtual memory addresses to physical memory addresses, obtaining virtual
address mapping information from the operating system 35, and recovering from local
address translation failures. The memory units 56 and 56a may also contain page tables
used to perform local virtual to physical address translations.

When used in a WAN networking environment, the personal computer 20 
typically includes a modem 58 or other means for establishing communications over the 
WAN 52. The modem 58, which may be internal or external, is connected to the system 
bus 23 via the serial port interface 46. In a networked environment, program modules 
depicted relative to the personal computer 20, or portions thereof, may be stored in the
remote memory storage device. It will be appreciated that the network connections
shown are exemplary and other means of establishing a communications link between the
computers may be used.

When used in any of the networking environments illustrated in Figure 1, data 
flow is preferably regulated between processes executing on the personal computer 20
and processes executing on the remote computer 49 that communicate with each other.
For example, the personal computer 20 may include a sender for sending data through
one of the network interface adapters 54 and 54a to a receiver executing on the remote
computer 49. Accordingly, in order to regulate data flow between the sender and the
receiver, the sender may include a credit list reader/processor for receiving and
processing credits from the receiver. The receiver may include a credit list
builder/communicator for building credit lists and communicating the credit lists to the
sender.

The present invention is not limited to regulating flow between processes
executing on separate computers. The credit list builder/communicator and the credit list
reader/processor may be used to regulate flow between a sender and a receiver executing
on the same machine. For example, the sender and the receiver each may comprise an
application program executing on the personal computer 20 that utilize a shared memory
region for communicating with each other. The shared memory region may include a
data portion and a control portion. In order to regulate flow, the credit list
builder/communicator of the receiver may write credits to the control portion of the shared
memory region. The credits may be indicative of receive buffer sizes in the data portion
of the shared memory region. In order to access the credits, the credit list
reader/processor of the sender may read the control portion of the shared memory region.
The credit list reader/processor preferably uses the credits in the order that the credits are
made available, preferably does not exceed the buffer size indicated by each credit, and
preferably only writes data to the data portion when credits are available. In this manner,
flow between the sender and the receiver may be regulated using credits in shared
memory.
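
The shared memory arrangement can be sketched with a single-process stand-in for the shared region. The names `SharedRegion`, `receiver_post_credit`, and `sender_write`, and the representation of each credit as an (offset, size) pair, are assumptions for illustration; a real implementation would use an actual shared mapping with appropriate synchronization.

```python
from collections import deque
from typing import Deque, Tuple


class SharedRegion:
    """Single-process stand-in for a shared memory region (illustrative only)."""

    def __init__(self, data_size: int) -> None:
        self.data = bytearray(data_size)                 # data portion
        self.control: Deque[Tuple[int, int]] = deque()   # control portion: (offset, size) credits


def receiver_post_credit(region: SharedRegion, offset: int, size: int) -> None:
    # The receiver advertises a receive buffer located inside the data portion.
    if offset < 0 or size <= 0 or offset + size > len(region.data):
        raise ValueError("buffer lies outside the data portion")
    region.control.append((offset, size))


def sender_write(region: SharedRegion, payload: bytes) -> int:
    """Write payload using the oldest credit; return bytes written (0 if no credit)."""
    if not region.control:
        return 0                                         # no credits: do not write
    offset, size = region.control.popleft()              # credits are used in order
    chunk = payload[:size]                               # never exceed the credit
    region.data[offset:offset + len(chunk)] = chunk
    return len(chunk)
```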

In yet another alternative, where the sender and receiver are executing on
different machines, RDMA write operations may be used to communicate credits from
the receiver to the sender. In RDMA write operations, the credit list
builder/communicator of the receiver may write credits directly to the memory of the
machine on which the credit message reader/processor of the sender executes. In order to

W ° 00/4,365 PCT/US99/30860 



perform an RDMA write operation, the credit list builder/communicator may construct a 
packet containing a list of credits and the destination memory address of the sender where 
the credits will be stored. The sender may receive the packet directly into the specified 
memory address. In order to use the credits, the credit list reader/processor may read the 
memory location that receives the RDMA packet. The credit list reader/processor may
use the credits to send data to the receiver in the manner previously described. Thus,
RDMA write operations provide yet another mechanism for communicating credits to the
sender.


In the description that follows, the invention will be described with reference to
acts and symbolic representations of operations that are performed by one or more
computers, unless indicated otherwise. As such, it will be understood that such acts and
operations, which are at times referred to as being computer-executed, include the
manipulation by the processing unit of the computer and/or the processing units of I/O
devices of electrical signals representing data in a structured form. This manipulation
transforms the data or maintains it at locations in the memory system of the computer
and/or the memory systems of I/O devices, which reconfigures or otherwise alters the
operation of the computer and/or the I/O devices in a manner well understood by those
skilled in the art. The data structures where data is maintained are physical locations of
the memory that have particular properties defined by the format of the data. However,
while the invention is being described in the foregoing context, it is not meant to be
limiting as those of skill in the art will appreciate that the acts and operations described
hereinafter may also be implemented in hardware.

Figure 2 illustrates an exemplary sender 60 and a receiver 62 including a system
for controlling data flow according to an embodiment of the present invention. In the
illustrated embodiment, the sender 60 and the receiver 62 may each comprise a collection
of processes executing on the same computer [...]
over a communication link 64. The communication link 64 may comprise a LAN, a
WAN, a SAN, or any other medium for transferring signals between connected devices.
If the sender and the receiver include application programs executing on the same
machine, the communication link 64 may comprise a bus, such as a data bus. The sender 60
may include a sending application 66 for requesting sending of data from one or
more send buffers 68 from an I/O device 70 to other applications. For example, the
sending application 66 may comprise a web server that sends data to other applications,
such as the receiving application 74 over the communication link 64. The I/O device 70
may comprise any device capable of sending data in response to requests from the sending
application. For example, the I/O device 70 may comprise a network interface
adapter, such as an Ethernet adapter. In order to reduce copying of data between the
sending application 66 and the I/O device 70, the I/O device 70 is preferably capable of
translating virtual memory addresses of data to be sent to physical memory addresses.
Exemplary mechanisms for translating virtual memory addresses to physical memory
addresses are described in copending U.S. Patent Application No. [...], filed
December 29, 1998, entitled, "Recoverable Methods for and Systems for Processing
[...]", the disclosure of which is incorporated herein by reference in its entirety.

The sender 60 may also include an I/O device interface 72 for controlling
communication between the sending application 66 and the I/O device 70. For example,
the I/O device interface 72 may include communications functions, such as sockets, MPI, 
and cluster functions that may be called by the sending application when requesting 
sending of data. The I/O device interface 72 may convert the requests into data structures
recognizable by the I/O device 70. In order to reduce the copying of data between the
sending application 66 and the I/O device 70, the I/O device interface 72 may also
include memory registration functions for registering memory used by applications with
the I/O device 70. However, because the I/O device 70 is preferably capable of
recovering from local virtual address translation failures, memory registration may not be
required. 

The receiver 62 may include the receiving application 74 for requesting receipt of 
data from an I/O device 76 into one or more receive buffers 78. For example, the 
receiving application 74 may comprise a web browser that receives data sent over a
network from other applications, such as the sending application 66. The I/O device 76
of the receiver may comprise any device capable of sending and receiving data over a
communication link in response to requests from the receiving application 74. The
receiver 62 preferably also includes an I/O device interface 80 for controlling
communication between the receiving application 74 and the I/O device 76. The I/O
device 76 and the I/O device interface 80 may be similar in structure to the I/O device 70
and the I/O device interface 72 of the sender and need not be further described.

According to an important aspect of the invention, the receiver 62 communicates 
credits to the sender 60 to control the flow of data packets 84 sent by the sender 60. In
[...] in which the receiving [...] the size of one or more receive buffers 78 [...]
of the receive buffers 78 and forward the [...]. Alternatively, the credit list
builder/communicator may communicate credits to the sender [...]
buffer or through RDMA write operations, as previously described. The credit list
builder/communicator may also determine when to communicate new credits to the
sender 60. Methods for determining when to communicate new credits
are discussed in more detail below.

[...] The credit list reader/processor [...]
credit message. When the receiver communicates credits to the sender
using a shared memory region or an RDMA write operation, the credit message
reader/processor may read data from the shared memory region or the buffer for receiving 
RDMA writes. 

According to an important aspect of the invention, the credit list reader/processor
75 preferably uses the credits to determine the size of data packets to be sent to the
receiver. In addition, the credit list reader/processor preferably uses the credits in the
order that the credits were received, so that the receiver will receive data in the correct
buffers. For example, the credit message 82 may indicate that the receiving application
74 has a first buffer of four bytes for receiving data and a second buffer of two bytes for
receiving data. The sending application 66 may have a send buffer of six bytes to be sent
to the receiving application. Under these conditions, the credit list reader/processor 75
may request that the I/O device 70 send a first data packet of four bytes and a second data
packet of two bytes to the receiver 62, e.g., by communicating the virtual addresses of the
data to be sent along with the appropriate sizes to the I/O device 70. The credit list
reader/processor 75 preferably maintains a list of credits received from the receiver 62
and removes credits from the list as the sender uses the credits. Thus, because the
receiver 62 preferably communicates credits indicative of application buffer sizes to the
sender, and the sender 60 constructs data packets having sizes based on the credits, data
flow between the sender and the receiver may be efficiently regulated. Moreover,
software copying, segmentation, and reassembly of data may not be required according to
preferred implementations of the invention because the data packets sent to the receiver
are preferably no greater in size than the corresponding receive buffers.
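
By way of illustration only, the four-byte and two-byte example above may be sketched
as follows. The function name packets_for_credits and the use of a Python deque to hold
the credit list are assumptions made for this sketch and are not part of the described
embodiments. The sketch consumes credits in the order received and never builds a data
packet larger than the credit it uses.

```python
from collections import deque
from typing import Deque, List, Tuple


def packets_for_credits(data: bytes, credits: Deque[int]) -> Tuple[List[bytes], bytes]:
    """Cut `data` into packets sized by the credits, oldest credit first.

    Returns the packets that may be sent now and the bytes that must wait
    for more credits. Each packet consumes exactly one credit, even when the
    packet is smaller than the credited buffer.
    """
    packets = []
    offset = 0
    while offset < len(data) and credits:
        size = credits.popleft()                 # remove the credit as it is used
        packets.append(data[offset:offset + size])
        offset += size
    return packets, data[offset:]


# Six bytes to send, credits of four bytes then two bytes.
packets, unsent = packets_for_credits(b"abcdef", deque([4, 2]))
```

When the credit list runs out before the send buffer does, the second return value holds
the remaining bytes, which the sender would hold back until new credits arrive.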
In order to [...] credit messages to the sender, the credit list
builder/communicator [...] posted for receiving data from the sender. In a preferred
embodiment, the sender and the receiver send and receive data over one or more data
connections and [...] credit messages over a control connection. When the sender and
the receiver communicate over multiple data connections, the credit message
builder/communicator may multiplex credits or credit messages transmitted over the
control connection. Each credit or credit message in the multiplexed connection may
indicate the data connection to which it pertains. The credit message reader/processor
of the sender may [...] to send data over the corresponding data connections. In order to
prevent credit overflow on the control connection, the credit message reader/processor
preferably maintains a credit message buffer for each data connection.

In yet another alternative embodiment, the receiver may communicate with a
plurality of senders. For example, [...]

The credit message builder/communicator and the credit message reader/processor
may be implemented in hardware, software, or a combination of hardware and software.
For example, in the embodiment illustrated in Figure 2, the credit message
builder/communicator and the credit message reader/processor may be components of the
communications provider software included in the I/O device interfaces 72 and 80. In an
alternative embodiment, the credit message builder/communicator and the credit message
reader/processor may be implemented in hardware of the I/O devices 70 and 76.
Implementing the credit message builder/communicator and the credit message
reader/processor in the hardware of the I/O devices allows flow control to be performed
transparently to the communications provider software.

Although the embodiment illustrated in Figure 2 shows a sender 60 and a receiver
62 respectively having a credit list reader/processor 75 and a credit list
builder/communicator 83, the present invention is not intended to be limited to such an
embodiment. For example, the sender 60 and the receiver 62 may each be capable of
sending and receiving data. Thus, the I/O device interface 72 of the sender 60 may
include a credit list builder/communicator 83 in addition to the credit list reader/processor
75. Similarly, the I/O device interface 80 of the receiver 62 may include a credit list
reader/processor 75 in addition to the credit list builder/communicator 83.

Figure 3 is a more detailed block diagram of the sender 60 and the receiver 62
illustrated in Figure 2. The sender 60 and the receiver 62 illustrated in Figure 3
preferably implement the Virtual Interface Architecture (VIA). According to the VIA
architecture, the efficiency of I/O operations may be increased by granting I/O devices
direct access to application-level data buffers, so that copying of data between
applications and the I/O devices is not required. In order to provide I/O devices direct
access to application-level buffers, the I/O device interfaces 72 and 80 communicate
descriptors to the I/O devices. A descriptor is a data structure containing I/O request
processing information, such as the virtual memory address and size of a send or receive
buffer. The I/O devices translate the virtual memory addresses in the descriptors to
physical memory addresses and either send data from or receive data into a buffer at the
physical memory address. The buffer size information in the descriptors may also be
used by the credit list builder/communicator 83 to generate credit messages.

In Figure 3, the I/O devices 70 and 76 preferably each comprise a VIA network
interface adapter capable of sending and receiving data and credit messages over the
communication link 64. A VIA network interface adapter may comprise any type of
network adapter capable of high-speed communications, for example, an Ethernet card,
such as a gigabit Ethernet card. In addition, the VIA network interface adapter is
preferably capable of translating virtual memory addresses of buffers used in I/O
operations into physical memory addresses.

The I/O device interfaces 72 and 80 of the sender 60 and the receiver 62 each
comprise a plurality of components for controlling communications between the
applications 66 and 74 and the I/O devices 70 and 76. For example, the I/O device
interface 72 of the sender 60 may include an operating system communication interface
88 and a virtual interface (VI) user agent 89. The I/O device interface 80 of the receiver
62 may also include an operating system communication interface 90 and a VI user agent
91. The operating system communication interfaces 88 and 90 and the VI user agents 89
and 91 of both the sender and the receiver may convert requests from the sending and
receiving applications into data structures, such as descriptors, for processing by the I/O
devices 70 and 76. Accordingly, the operating system communication interfaces 88 and
90 may include standard communications functions for performing network I/O, such as
sockets, MPI, cluster, or other communications functions. The VI user agents 89 and 91
may communicate memory registration requests through communication links 92 and 93
to VI kernel agents 94 and 95. The VI kernel agents 94 and 95 may be components of the
operating systems of the sender and the receiver that function as device drivers for the I/O
devices 70 and 76. The VI kernel agents 94 and 95 may receive the memory registration
requests from the VI user agents 89 and 91 and register memory used by the sending and
receiving applications 66 and 74 with the I/O devices 70 and 76. In addition, the VI
kernel agents 94 and 95 may establish and break connections with remote machines. The
VI kernel agents 94 and 95 may also manage one or more virtual interfaces, such as
virtual interface 96 of the sender 60 and virtual interface 97 of the receiver 62.

The virtual interfaces 96 and 97 may comprise communication interfaces between
the sending and receiving applications 66 and 74 and the I/O devices 70 and 76. The
virtual interface 96 of the sender 60 may include a send queue 98 and a receive queue 99.
Similarly, the virtual interface 97 of the receiver 62 may include a send queue 100 and a
receive queue 101. In order to request an I/O operation, the sending and receiving
applications 66 and 74 may execute standard I/O commands, such as Winsock send() and
Winsock recv(). In response to these commands, the VI user agents 89 and 91 may post
[...] data to the receiver.

In order to receive data from the sender, the receiving application 74 preferably
sends receive data requests to the VI user agent 91, which posts descriptors 106 and 107
in the receive queue 101 specifying one or more receive buffers 78 to store data from the
sender. However, if no descriptors are posted in the receive queue 101 when the data
arrives, the connection between the sender and receiver may be broken. Similarly, because
the receiver may not perform segmentation or reassembly of data, if data in a given data
packet from the sender exceeds the size of the receive buffer in the descriptor 107 at the
head of the receive queue 101, the connection may also be broken. Accordingly, it is
desirable to coordinate posting of descriptors in the send queue 98 of the sender with the
posting of descriptors in the receive queue 101 of the receiver; i.e., it is desirable to
control flow between the sender and the receiver.

In order to control flow between the sender 60 and the receiver 62, the credit list
builder/communicator 83 builds credit messages based on sizes of the receive buffers
contained in receive data requests initiated by the receiving application 74. For example,
when the receiving application posts a descriptor in the receive queue, the credit list
builder/communicator 83 may record the size of the buffer specified by the descriptor in a
credit message. The credit list builder/communicator 83 may repeat this process for each
descriptor posted in the receive queue. When the number of credits in the receive queue
reaches a predetermined value or when the credit list builder/communicator 83
determines that the sender needs credits, the credit list builder/communicator 83
preferably requests that the I/O device 76 send a credit message 82 to the sender.
Methods for determining when the sender needs credits will be discussed in more detail
below.
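
By way of illustration only, the accumulation of credits from posted descriptors may be
sketched as follows. The class names Descriptor and CreditListBuilder, the
send_credit_message callback standing in for the I/O device 76, and the default
threshold of four credits are hypothetical and chosen only for this sketch. The sketch
covers only the "predetermined value" condition; the separate determination that the
sender needs credits is not modeled.

```python
from dataclasses import dataclass, field
from typing import Callable, List


@dataclass
class Descriptor:
    virtual_address: int   # virtual memory address of the receive buffer
    size: int              # size of the receive buffer in bytes


@dataclass
class CreditListBuilder:
    # Stand-in for asking the I/O device to transmit a credit message.
    send_credit_message: Callable[[List[int]], None]
    threshold: int = 4     # hypothetical "predetermined value"
    pending: List[int] = field(default_factory=list)

    def on_descriptor_posted(self, descriptor: Descriptor) -> None:
        """Record one credit per posted receive descriptor."""
        self.pending.append(descriptor.size)
        if len(self.pending) >= self.threshold:
            self.flush()

    def flush(self) -> None:
        """Send all accumulated credits as a single credit message."""
        if self.pending:
            self.send_credit_message(list(self.pending))
            self.pending.clear()


sent_messages: List[List[int]] = []
builder = CreditListBuilder(send_credit_message=sent_messages.append, threshold=2)
builder.on_descriptor_posted(Descriptor(virtual_address=0x1000, size=4))
builder.on_descriptor_posted(Descriptor(virtual_address=0x2000, size=2))
```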

Credit messages may be sent to the sender in any suitable manner, for example, by
posting a descriptor in the send queue 100 of the receiver containing the size and virtual
address of the credit message and ringing the send queue doorbell. However, in a
preferred embodiment, a separate connection from the connection(s) for sending and
receiving data is used for the sending and receiving of credits. Accordingly, since each
virtual interface may connect to one other virtual interface to form one network
connection, the sender and the receiver may each include an additional virtual interface
for sending and receiving credit messages. In addition, in an alternative embodiment of
the invention, a single sender and a single receiver may communicate over multiple data
connections. In such an embodiment, the sender and the receiver may each include
multiple virtual interfaces for the data connections and a single virtual interface for a
control connection for the exchange of credit messages. The credit message
builder/communicator of the receiver may multiplex credits or credit messages sent over
the control connection to the sender. Each credit or credit message may specify the data
connection or virtual interface to which it pertains. The credit message reader/processor
of the sender may demultiplex the credits and use them to control the sending of data over
the corresponding data connections. To prevent credit message overflow, the credit
message reader/processor may maintain a separate credit message buffer for each data
connection. In yet another alternative embodiment, the receiver may communicate with a
plurality of remote client senders. In such an embodiment, the receiver may include one
virtual interface for sending credits to and receiving credits from each of the remote
senders. That is, the number of credit message connections may be equal to the number
of remote client senders. Any number of credit message connections and data connections
is within the scope of the invention.
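
By way of illustration only, demultiplexing credits received over a single control
connection may be sketched as follows. The representation of each credit as a
(connection identifier, size) pair and the function name demultiplex are assumptions made
for this sketch; the description above does not specify a credit message layout.

```python
from collections import defaultdict, deque


def demultiplex(control_messages, credit_queues=None):
    """Route (connection_id, size) credits into one FIFO per data connection.

    Keeping a separate queue per data connection preserves the order in which
    credits for that connection were issued.
    """
    queues = credit_queues if credit_queues is not None else defaultdict(deque)
    for connection_id, size in control_messages:
        queues[connection_id].append(size)
    return queues


# Credits for two data connections arrive interleaved on the control connection.
queues = demultiplex([(1, 4), (2, 8), (1, 2)])
```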

Once a credit message is transmitted to the sender, the credit list reader/processor
75 of the sender receives the credit message 82. Receiving a credit message may require
the previous posting of a descriptor containing the virtual address and size of a credit
message buffer in the receive queue 99 of the sender. Alternatively, as discussed above,
credits may be received over a separate connection from the connection for receiving
data. The credit list reader/processor 75 may use the credits in the credit message to
control the posting of descriptors in the send queue 98. For example, the sender may
have a send buffer containing six bytes of data to be sent to the receiver. The credit
message 82 from the receiver may contain a first credit of four bytes and a second credit
of two bytes, indicating that the descriptors 106 and 107 specify two-byte and four-byte
receive buffers, respectively. Accordingly, the credit list reader/processor 75 may post a
first descriptor 103 in the send queue 98 containing a pointer pointing to the first byte of
the send buffer with a size of four bytes and a second descriptor 102 in the send queue 98
containing a pointer pointing to the fifth byte of the send buffer 68 with a size of two
bytes. The I/O device 70 may process the descriptor 103 and transmit a first data packet
having four bytes of data to the receiver. The I/O device 70 may process the second
descriptor 102 and transmit a second data packet of two bytes of data to the receiver.
When the receiver receives the data packets, the receiver processes the descriptor 107,
then the descriptor 106, to store the received data packets in four- and two-byte receive
buffers, respectively. In this manner, [...] capable of receiving. As a result, data
transmission overflow conditions are reduced and transmission efficiency is increased.

In the embodiment illustrated in Figure 3, the credit list builder/communicator 83
and the credit list reader/processor 75 are preferably implemented in software, e.g., in the
VI user agents 91 and 89. In an alternative embodiment, the credit list
builder/communicator and the credit list reader/processor may be implemented in
hardware, e.g., in hardware of the I/O devices. Implementing the credit list
builder/communicator and the credit list reader/processor in hardware of the I/O
devices allows flow control functions to be performed transparently to the
communications software of the sender and the receiver.

Figure 3(a) is a more detailed block diagram of a sender and a receiver in [...]
the credit list builder/[...] component of the I/O device 70 of the sender and the credit
list [...]. The remaining components in Figure 3(a) are the same as those illustrated in
Figure 3, and their descriptions are therefore not repeated.

In order to regulate the flow of data from the sender, the credit list
builder/communicator [...] descriptors posted in the receive queue 101. The list of
credits may be stored in memory of the I/O device 76 or in memory of the host computer
in which the I/O device is inserted. The credit list builder/communicator 83a may send
the credit list to the sender by instructing the I/O device 76 to send the list directly from
the memory location in which the credit list is stored. The credit list
builder/communicator 83a may also determine when to communicate new credits to the
sender as will be discussed in more detail below.

The credit list reader/processor 75a may receive the credit list and process the
credits in order to send data to the receiver. However, unlike the credit list
reader/processor 75 illustrated in Figure 3, rather than posting descriptors in the send
queue, the credit list reader/processor 75a may control the sending of data specified by
descriptors previously posted in the send queue 98 of the sender so that the size of data
packets actually sent to the receiver corresponds to the credits. For example, a descriptor
specifying the sending of eleven bytes of data may be located at the head of the send
queue 98. The credit list reader/processor 75a may have two credits of five bytes and six
bytes. Accordingly, the credit list reader/processor 75a may break the data buffer
specified by the descriptor into a first data packet of five bytes and a second data packet
of six bytes. Thus, when the credit list reader/processor and the credit list
builder/communicator are implemented in hardware, flow control can be achieved
transparently to the VI user agents 88 and 90.

As stated above, the credit list builder/communicator 83 preferably determines
when to communicate new credits to the sender. Determining when to provide the sender
with new credits may be accomplished in any number of ways. Figure 4 illustrates
exemplary steps which may be performed by the credit list builder/communicator 83 to
determine when to communicate new credits to the sender. In step ST1, the credit list
builder/communicator 83 may receive requests for receiving data from the receiving
application 74. The credit list builder/communicator 83 preferably determines the size of
the receive buffer in each request and adds a credit of a corresponding size to a credit list.
Step ST1 is preferably executed repeatedly and concurrently with the remaining steps in
Figure 4 to accumulate credits as requests are received from the receiving application 74.
In steps ST2 and ST3, the credit list builder/communicator 83 determines whether the
number of accumulated credits exceeds a predetermined number or whether the sender
[...] message length, which may be determined by the smaller of the network MTU between
the sender and the receiver and the size of the buffer posted by the sender to receive credit
messages. If either condition is satisfied, the credit list builder/communicator 83 may
communicate a first batch or list of credits to the sender (step ST4). For example, the
credit list builder/communicator may instruct the I/O device 76 to send the credit list to
the sender, e.g., by posting a descriptor having a pointer to the credit message in the send
queue of the control connection of the receiver and ringing the send queue doorbell. In
an alternative embodiment, for example, where the sender and receiver communicate
using shared memory, the credit list builder/communicator 83 may write the credit list to
the [...] portion of the shared memory [...]. If neither of the conditions is satisfied, the
credit list builder/communicator may continue to accumulate credits. In steps ST5 and
ST6, the credit list builder/communicator 83 determines whether data has been received
from the sender for the first buffer specified in the first batch of credits communicated to
the sender. If data has not been received in the first buffer, the credit list
builder/communicator 83 preferably continues checking whether data has been received
in the first buffer, i.e., without communicating a new list of credits to the sender. If the
credit list builder/communicator 83 determines that data has been received in the first
buffer specified in the first credit list, the credit list builder/communicator 83 determines
whether new credits are available (steps ST7 and ST8). If new credits are available, the
credit list builder/communicator 83 preferably communicates a new credit list to the
sender (step ST9). For example, the credit list builder/communicator may instruct the I/O
device 76 to send a new credit list to the sender containing newly accumulated credits.
The newly accumulated credits may be based on receive buffers contained in data receive
requests initiated by the receiving application after the previous credit message was sent.
If there are no newly accumulated credits, the credit list builder/communicator 83 may
continue to check until new credits are available. After the new credit list is
communicated to the sender, the credit list builder/communicator 83 determines whether
data has been received in the first buffer specified in the new credit list (steps ST10 and
ST11). If data has not been received in the first buffer in the new credit list, the credit list
builder/communicator 83 preferably continues checking, i.e., without communicating
another new credit list to the sender. If the credit list builder/communicator 83 determines
that data has been received in the first buffer in the new credit list, the credit list
builder/communicator 83 preferably checks whether new credits are available and
instructs the I/O device 76 to send another new credit list to the sender.

The approach illustrated in Figure 4, in which the receiver communicates a new
credit list to the sender when the first buffer specified in a previous credit list is used,
increases the number of credits available to the sender at any given time. This approach
may be desirable if the sender is rapidly consuming available credits. In an alternative
approach, steps ST5, ST6, ST10, and ST11 can be modified so that the credit list
builder/communicator 83 determines when the last buffer in a previous credit list is used
before communicating a new credit list to the sender. This approach would reduce the
number of credit list communications sent by the receiver and the number of credits
available to the sender at any given time. Such an approach may be desirable if the
sender is not rapidly consuming available credits. In yet another alternative, the credit list
builder/communicator 83 may instruct the I/O device of the receiver to communicate new
credit lists to the sender when a buffer between the first and last buffer in a previous
credit list receives data from the sender.
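
By way of illustration only, the first-buffer approach of Figure 4 may be sketched as an
event-driven object. The class name FirstBufferTrigger, the buffer identifiers, and the
first_batch_size parameter are hypothetical, and the sketch sends the first list when the
accumulated count reaches first_batch_size rather than modeling the message length
condition. It is a simplified reading of the steps, not a definitive implementation.

```python
class FirstBufferTrigger:
    """Send a new credit list when the first buffer of the previous list is used."""

    def __init__(self, send, first_batch_size=2):
        self.send = send                    # stand-in for instructing the I/O device
        self.first_batch_size = first_batch_size
        self.accumulated = []               # (buffer_id, size) credits not yet sent
        self.first_buffer = None            # first buffer of the most recent list
        self.due = False                    # trigger fired but no credits were ready

    def on_receive_request(self, buffer_id, size):
        """ST1: accumulate one credit per receive request."""
        self.accumulated.append((buffer_id, size))
        initial = (self.first_buffer is None
                   and len(self.accumulated) >= self.first_batch_size)
        if initial or self.due:
            self._send_list()

    def on_data_received(self, buffer_id):
        """ST5 to ST9: data in the first credited buffer triggers a new list."""
        if buffer_id == self.first_buffer:
            self.due = True
            if self.accumulated:
                self._send_list()

    def _send_list(self):
        credit_list, self.accumulated = self.accumulated, []
        self.first_buffer = credit_list[0][0]
        self.due = False
        self.send(credit_list)
```

Changing the comparison in on_data_received to track the last buffer of the previous
list, or a buffer in between, would give the alternative approaches described above.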

According to another aspect of the invention, the method for determining when to
communicate new credits to the sender is adaptable. Figure 5 illustrates an adaptable
approach for determining when to communicate new credits to the sender. [...] In
step ST7, after data has been received in a first buffer specified in a first credit list
previously communicated to the sender, the credit list builder/communicator 83
determines the frequency at which the buffers are being used by the sender. In step ST8,
the credit list builder/communicator 83 determines which of the buffers being used by the
sender will trigger the sending of a new credit message, based on the frequency. For
example, if the sender is rapidly using buffers in the current credit message, a new credit
message may be sent when a buffer near the beginning of the current credit message is
used. On the other hand, if the sender is slowly consuming the buffers in the current
credit message, the credit list builder/communicator 83 may wait until a buffer near the
end of the current credit message is used to send the new credit message. In steps ST9
and ST10, the credit list builder/communicator 83 determines if the triggering buffer in
the current credit message has received data from the sender. If the triggering buffer has
not received data, the credit list builder/communicator 83 preferably continues checking.
If the triggering buffer has received data, the credit list builder/communicator 83 may
determine if any new credits have been accumulated (steps ST11 and ST12). If new
credits have not been accumulated, the credit list builder/communicator may continue
checking. If new credits have been accumulated, the credit list builder/communicator 83
may instruct the I/O device 76 to send a new credit list to the sender. In steps ST14 and
ST15, the credit list builder/communicator 83 determines whether data has been received
in the first buffer in the new credit list. If data has not been received in the first buffer, the
credit list builder/communicator 83 preferably continues checking. If data has been
received, the credit list builder/communicator 83 returns to step ST7 to determine the
frequency at which the sender is utilizing buffers and determine which buffer in the new
credit message will trigger the sending of another new credit list. In an alternative
arrangement, step ST7 may be executed continuously so that the triggering buffer can be
updated continuously. In this manner, the number of credit messages and credits made
available to the sender is controlled based on the rate at which the sender is using credits.
In another alternative arrangement, the times at which buffers in a first credit message are
used may be input to an adaptive-predictive filter. The adaptive-predictive filter may
predict when the sender will most likely need a new list of credits.
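
By way of illustration only, one possible way of choosing the triggering buffer from the
observed buffer usage frequency is sketched below. The function name
choose_trigger_index, the use of the mean interval between buffer uses, and the
fast_interval and slow_interval thresholds are all assumptions made for this sketch; the
description above does not prescribe any particular formula.

```python
def choose_trigger_index(use_times, list_length,
                         fast_interval=0.001, slow_interval=0.1):
    """Pick which buffer in the next credit list triggers the following list.

    use_times: timestamps, in seconds, at which buffers in the current credit
    list received data. A fast sender gets an early trigger (index 0); a slow
    sender gets a late trigger (the last index); rates in between are mapped
    linearly onto the positions in the list.
    """
    if len(use_times) < 2 or list_length < 1:
        return 0                              # not enough history: trigger early
    mean_interval = (use_times[-1] - use_times[0]) / (len(use_times) - 1)
    if mean_interval <= fast_interval:
        return 0
    if mean_interval >= slow_interval:
        return list_length - 1
    fraction = (mean_interval - fast_interval) / (slow_interval - fast_interval)
    return round(fraction * (list_length - 1))
```

Defaulting to index 0 when little history is available reduces to the first-buffer
behavior of Figure 4.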

The present invention is not limited to utilizing a triggering buffer to determine
when to communicate a new credit list to the sender. For example, in an alternative
embodiment, steps ST9 and ST10 in Figure 5 may be replaced by steps for determining a
time, e.g., in milliseconds, for communicating a new credit list to the sender. In such an
embodiment, the credit list builder/communicator 83 may monitor the frequency at which
the sender consumes credits and, based on the frequency, determine to communicate new
credits after a predetermined time period elapses. Any method for adaptively
determining when to communicate new credits to the sender is within the scope of the
invention.

In another alternative embodiment, the credit list builder/communicator may
implement quality of service functions by regulating the rate at which credits are
communicated to the sender. For example, the receiver may comprise a server that
provides services to a plurality of client senders. The receiver may prevent any one of the
clients from exceeding a predetermined maximum allowable bandwidth by not sending
credits to the client when doing so would allow the client to exceed the maximum
allowable bandwidth. By preventing clients from exceeding a maximum allowable
bandwidth using credits, the server can guarantee a certain quality of service to all clients.
For example, since none of the clients can exceed the maximum allowable bandwidth,
the processing load on the server is determined by the number of clients and the
maximum allowable bandwidth. If the maximum allowable bandwidth is set somewhat
lower than the bandwidth that the server is capable of servicing for a given connection,
the server can service a greater number of clients with less hardware. In contrast, without
a maximum bandwidth limitation, in order to guarantee service to a given number of
clients, the server must contain sufficient resources to handle bursts of communications
from the clients in excess of the average bandwidth. Thus, the credit-based methods and
systems for regulating flow between a sender and the receiver can be used to facilitate
server resource planning.
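
By way of illustration only, withholding credits to cap a client's bandwidth may be
sketched as follows. The class name BandwidthCappedGranter and the fixed accounting
window are assumptions made for this sketch; a fixed window is only one of several
policies that could bound the rate at which credits are granted.

```python
class BandwidthCappedGranter:
    """Grant credits to one client only while it stays under a byte budget."""

    def __init__(self, max_bytes_per_second, window_seconds=1.0):
        self.window_seconds = window_seconds
        self.budget = max_bytes_per_second * window_seconds  # bytes per window
        self.window_start = 0.0
        self.granted = 0                  # bytes credited in the current window

    def try_grant(self, credit_size, now):
        """Return True if a credit of `credit_size` bytes may be sent at time `now`."""
        if now - self.window_start >= self.window_seconds:
            self.window_start = now       # start a new accounting window
            self.granted = 0
        if self.granted + credit_size > self.budget:
            return False                  # withholding the credit enforces the cap
        self.granted += credit_size
        return True


granter = BandwidthCappedGranter(max_bytes_per_second=10)
first = granter.try_grant(6, now=0.0)     # within budget
second = granter.try_grant(6, now=0.5)    # would exceed 10 bytes in this window
```

Because the sender transmits only against credits it holds, bounding the credited bytes
per window bounds the bytes the client can send in that window.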

According to another aspect of the invention, the sender may transmit in-band
information to the receiver along with the data packets. The in-band information may
include the cumulative amount of data remaining to be sent by the sender. The credit list
builder/communicator 83 may utilize this information to switch between one or more of
the approaches previously described for determining when to communicate new credits to
the sender. Figure 6 illustrates an exemplary approach for switching between modes for
determining when to communicate new credits to the sender. In step ST1, the credit list
builder/communicator 83 operates in a first mode for determining when to communicate
new credits to the sender. The first mode may comprise the steps illustrated in Figure 4
for sending new credit lists when a first buffer in a previous credit list triggers the sending
of a new credit list. In step ST2, the credit list builder/communicator 83 analyzes the
in-band data transmitted from the sender. The in-band data may include any information
for [...] the credit list builder/communicator 83 in determining when to communicate
new credits to the sender. For example, the in-band data may include the amount of data
remaining to be sent by the sender.

In step ST3, the credit list builder/communicator 83 may determine whether
switching would increase performance, e.g., by analyzing the amount of data remaining
to be sent, the number of credits [...], and/or frequency of buffer usage. For
example, the credit list builder/communicator 83 may compare the monitored information
to a threshold value or a set of threshold values to determine whether to switch modes. If
the analysis indicates that the rate at which credits are currently being communicated to
[...] mode for determining when to communicate new credits to the sender to slow the
rate at which credits are communicated to the sender (step ST4). On the other hand, if the
analysis indicates that the communication of credits is too slow, the credit list
builder/communicator 83 may switch to a third mode for determining when to
communicate new credits to the sender to increase the rate of communication of credits
to the sender. If switching modes would not increase performance, the credit list
builder/communicator 83 may [...] (step ST5) and return to checking the in-band
information. The in-band information may also be used as [...] credits to the sender. By
using in-band information from the sender to determine [...] new credit messages, the
I/O performance of [...] improved. For instance, the in-band information may be
utilized to reduce the number of mode switches when the data rate is highly variable.

Figures 7(a) and 7(b) illustrate steps that may be performed by the credit list
reader/processor 75 according to an embodiment of the present invention. Figure 7(a) 
illustrates exemplary steps that may be performed by the credit list reader/processor 75 
for receiving credits from the receiver and posting buffers to receive new credits. Figure 
7(b) illustrates exemplary steps that may be performed by the credit list reader/processor 
75 for processing credits to send data to the receiver. The steps illustrated in Figure 7(a) 
may be executed concurrently with the steps illustrated in Figure 7(b). In Figure 7(a), the 
credit list reader/processor 75 posts a first buffer for receiving credits and notifies the I/O
device 70 of the posting (step ST1). For example, the credit list reader/processor may
write the virtual address of a descriptor pointing to the buffer for receiving credit 
messages to the receive queue of the virtual interface of the sender for sending and 
receiving credit messages and ring the associated doorbell. Alternatively, where credits 
are communicated to the sender using shared memory or RDMA write operations, the 
sender may ensure that buffer space in memory exists for receiving credits. The buffer is 
preferably posted before a connection is established between the receiver and the sender. 
In steps ST2 and ST3, the credit list reader/processor checks for credits transmitted from 
the receiver. Checking whether credits have been received may include reading the 
memory location or locations reserved for receiving credits. If credits have not been 
received, the credit list reader/processor may continue checking or waiting to be notified 
of the reception of credits. If credits have been received, the credit list reader/processor 
75 may post a second buffer for receiving new credits from the receiver (ST4). Once the 
second buffer has been posted, the credit list reader/processor may store the credits from
the first buffer in a credit list and return to step ST2 to check for credits in the buffer
posted in step ST4 (step ST5).
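
The ordering of steps ST1 through ST5 can be sketched in Python as follows. This is a minimal illustration under stated assumptions, not the described implementation: the class name CreditReceiver, the method names, and the use of in-memory queues to stand in for buffers posted to an I/O device are all hypothetical.

```python
from collections import deque


class CreditReceiver:
    """Hypothetical model of steps ST1 to ST5 of Figure 7(a)."""

    def __init__(self):
        self.empty = deque()        # posted buffers waiting for a credit message
        self.filled = deque()       # posted buffers that now hold a credit message
        self.credit_list = deque()  # credits available for sending data
        self.log = []               # order of actions, kept for illustration
        self._post()                # ST1: post a first buffer

    def _post(self):
        self.empty.append([])
        self.log.append("post")

    def deliver(self, credits):
        """Stand-in for a credit message arriving from the receiver."""
        if not self.empty:
            raise RuntimeError("credit message arrived with no buffer posted")
        buf = self.empty.popleft()
        buf.extend(credits)
        self.filled.append(buf)

    def poll(self):
        """ST2/ST3: check for credits; ST4: post again; ST5: store credits."""
        if not self.filled:
            return False
        self._post()  # ST4 happens before the new credits are stored or used
        self.credit_list.extend(self.filled.popleft())  # ST5
        self.log.append("store")
        return True
```

In this sketch the second buffer is always posted before the credits from the first buffer are stored, so a later credit message always finds a buffer waiting.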

In Figure 7(b), the credit list reader/processor 75 receives a request for sending
data [...] and determines if the credit list contains
any credits. If the credit list does not contain credits, the credit list reader/processor
75 preferably continues checking, i.e., without requesting sending of the data. If the
credit list contains credits, the credit list reader/processor 75 may request the [...]
(step ST5a). The credit list reader/processor 75 may then update a data pointer pointing to
the next data to be sent and check whether any data remains to be sent (steps ST6a and ST7a).
If data remains to be sent, the credit list reader/processor [...]. If no data remains
to be sent, the credit list reader/processor may return to step ST1a to receive the next
request for sending data from the sending application.
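
The send-side loop of Figure 7(b) can be sketched in Python as follows. This is an illustrative sketch only: the function name send_with_credits and its return convention are assumptions, and appending to a list stands in for requesting that the I/O device send a packet.

```python
from collections import deque


def send_with_credits(data, credit_list, offset=0):
    """Send as much of data as the current credits allow.

    Returns (packets, offset), where offset is the data pointer to the
    next unsent byte. One credit is consumed per packet.
    """
    packets = []
    # Keep going only while data remains and the credit list is not empty.
    while offset < len(data) and credit_list:
        credit = credit_list.popleft()         # use credits in the order received
        packet = data[offset:offset + credit]  # never larger than the credit
        packets.append(packet)                 # stand-in for requesting the send
        offset += len(packet)                  # update the data pointer
    return packets, offset
```

When the credit list is empty the function returns without sending, which corresponds to the sender waiting for the receiver to replenish its credits.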

By only sending data when credits are available, the credit list processor
[...]. Because the
credit list reader/processor 75 posts a new buffer for receiving credits before using newly-
received credits, the credit list reader/processor 75 allows the credit list
builder/communicator 83 of the receiver to maintain that the sender has a buffer available
for receiving credits if the receiver receives data corresponding to the first credit in a new
credit list. Thus, embodiments of the present invention may also reduce the likelihood of
data transmission overflow conditions in transmitting credit messages from the receiver
to the sender.

The credit list reader/processor according to the present invention is not limited to 
the embodiments illustrated in Figures 7(a) and 7(b). For example, as stated above, the 
credit list reader/processor 75 may control the transmission of in-band data from the 
sender to the receiver. The in-band data may be transmitted along with data packets from 
the sender to the receiver. The in-band data may include any data to assist the receiver in
determining when to transmit credit messages to the sender. For example, the in-band 
information may include the amount of data remaining to be sent by the sender. 
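
One way to carry such in-band information is sketched below in Python. The layout is purely an assumption for illustration: a hypothetical 4-byte big-endian field holding the number of bytes remaining to be sent is prepended to each payload. The text does not specify a wire format, nor whether the in-band field counts against the credit size.

```python
import struct

# Hypothetical in-band field: 4-byte unsigned count of bytes still to be sent.
HEADER = struct.Struct("!I")


def add_in_band(payload, remaining):
    """Prefix a payload with the amount of data remaining at the sender."""
    return HEADER.pack(remaining) + payload


def split_in_band(packet):
    """Return (remaining, payload) from a packet built by add_in_band."""
    (remaining,) = HEADER.unpack_from(packet)
    return remaining, packet[HEADER.size:]
```

A receiver could use the recovered count to decide when to transmit its next credit message.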

Figure 8 is a flow diagram illustrating an example of the transfer of data and 
credit messages between the sender 60 and the receiver 62. Each row in the flow 
diagram indicates status information and action taken by the sender 60 and the receiver
62. The first column C1 in the diagram represents the receiver 62, including the credit
list builder/communicator 83, the second column C2 represents the communication link
64, and the third column C3 represents the sender 60 including the credit list
reader/processor 75. In row R1, column C3, the sender 60 has a send buffer 68(a) of
seventy-two bytes to send to the receiver 62. The send buffer 68(a) may have been
communicated to the credit list reader/processor 75 by the sending application of the
sender. PB is a pointer to the first byte of the send buffer 68(a). The sender initially has
zero credits. In row R1, column C1, the receiver has two available receive buffers 78(a)
and 78(b) of size three bytes and seven bytes, respectively. In row R1, column C2, the
receiver 62 sends a credit message 82(a) to the sender 60 indicating the size of the buffers
78(a) and 78(b). The sender maintains two pointers, Rtrip and Next. Rtrip points to the
first buffer indicated in a credit message sent to the sender. The pointer Next points to the
first receive buffer not communicated to the sender in a credit message. Since there are
no buffer sizes that have not been transmitted to the sender, the pointer Next
equals zero.

The pointer Rtrip may be used to indicate to the sender when new credit message
buffers are available at the sender for receiving credit messages. For example, as described
above, if data is received in a buffer in a list of credits previously communicated to the
sender, the receiver can assume that the sender has posted a new buffer for receiving
credit messages. Accordingly, when data is received in the first buffer of the credit
message sent to the sender, the pointer Rtrip may be set to NULL (see row R9, column
C1).

In row R2, column C3, the credit list reader/processor 75 of the sender receives
the credit message 82(a) and adds credits of three and seven to the credit list. The credit
list reader/processor 75 posts a first descriptor to the send queue of the sender to send the
first three bytes of the send buffer 68(a) and updates the buffer pointer PB to point to the
next byte to be sent.

In row R3, column C2, the sender sends a data packet 84(a) containing the first
three bytes of the send buffer 68(a) to the receiver. In row R3, column C3, the credit list
reader/processor 75 removes the used credit of three from the credit list, posts a descriptor
in the send queue to send the next seven bytes of the send buffer 68(a) and updates the
buffer pointer PB. The shaded bytes in the send buffer 68(a) indicate data that has been
sent to the receiver. Thus, in row R3, column C3, the first three bytes of the send buffer
68(a) have been sent. In row R3, column C1, the credit list builder/communicator 83 has
received a request for receiving data in a new buffer 78(c) of size twenty-two. Since this 
buffer 78(c) has not been communicated to the sender, the credit list 
builder/communicator preferably updates the pointer Next to point to this buffer. 

In row R4, column C2, the sender sends a data packet 84(b) containing the next
seven bytes of the send buffer 68(a). In row R4, column C3, the sender credit list
reader/processor 75 preferably removes the used credit of seven from the credit list. In
this state, the sender has no credits and preferably does not send any more data until
receiving more credits from the receiver. In row R4, column C1, the receiver receives the
data packet 84(a) into the receive buffer 78(a). The credit list builder/communicator 83
has received a request for receiving data into a new buffer 78(d) of size forty-four.

In row R5, column C3, the sender is idle because it has no credits. In row R5, 
column C2, the receiver sends a credit message 82(b) containing credits of twenty-two 
and forty-four to the sender. In row R5, column C1, the pointer Rtrip is updated to point
to the receive buffer 78(c) corresponding to the first credit in the new credit message
82(b). The credit list builder/communicator 83 sets the pointer Next to zero because there
are no credits that have not been transmitted to the sender. The buffer 78(a) has been
removed from the buffers available for receiving data because the descriptor for that
buffer has been processed. The receive buffer 78(b) receives the data packet 84(b).

In row R6, column C3, the sender receives the credits twenty-two and forty-four. 
The credit list reader/processor 75 posts a descriptor in the send queue to send the next 
twenty-two bytes of the send buffer 68(a). The credit list reader/processor 75 updates the 
buffer pointer PB. In row R6, column C1, the receiver removes the buffer 78(b) from the
list of buffers available for receiving data since it previously received data. 

In row R7, column C2, the sender sends a data packet 84(c) containing the next 
twenty-two bytes of the send buffer 68(a). In row R7, column C3, the credit list 
reader/processor 75 removes the used credit of twenty-two from the credit list. The credit 
list reader/processor 75 posts a descriptor in the send queue to send the next forty bytes of
data to the receiver and updates the buffer pointer PB. In row R7, column C1, the credit
list builder/communicator 83 receives a request for receiving data into a new buffer 78(e)
of size eleven. The credit list builder/communicator 83 updates the pointer Next to point
to the new buffer 78(e).

In row R8, column C1, the receiver receives the data packet 84(c) in the receive
buffer 78(c). In row R8, column C2, the sender sends a data packet 84(d) containing the
last forty bytes of the send buffer 68(a) to the receiver. In row R8, column C3, the
credit list reader/processor removes the used credit of forty-four from the credit list. The
use of a forty-four-byte credit to send a buffer of forty bytes illustrates an acceptable, but
inefficient use of a credit. Since all of the data has been sent, the buffer pointer PB is set
to NULL.

In row R9, column C1, the receiver receives the data packet 84(d) in the receive
buffer 78(d). The receive buffer 78(d) is removed from the set of receive buffers
available to receive data since its descriptor has been processed. The pointer Rtrip is set
to NULL to indicate the availability of a credit message buffer at the sender. Because the
receiver sends a list of credits to the sender, the sender sends data to the receiver based on
the size and order of the credits, and the receiver receives data into buffers according to
the size and order of the credits, reliable flow control between the sender and the receiver 
can be achieved with reduced copying of data. 
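
The byte counts in the Figure 8 example can be checked with a short Python replay. This sketch models only the sender's use of credits; the function name replay and its arguments are assumptions made for illustration, and the pointers Rtrip and Next are not modeled.

```python
from collections import deque


def replay(send_size, credit_messages):
    """Replay credit use for a send buffer of send_size bytes.

    credit_messages is a list of credit lists in arrival order.
    Returns (packets, sent), where packets holds (bytes_sent, credit_used)
    pairs and sent is the total number of bytes transmitted.
    """
    credit_list = deque()
    sent = 0
    packets = []
    for message in credit_messages:
        credit_list.extend(message)          # sender adds the new credits
        while credit_list and sent < send_size:
            credit = credit_list.popleft()   # credits are used in order
            size = min(credit, send_size - sent)
            packets.append((size, credit))
            sent += size
    return packets, sent
```

Calling replay(72, [[3, 7], [22, 44]]) yields packets of three, seven, twenty-two and forty bytes, with the final forty-byte packet consuming the forty-four-byte credit, matching data packets 84(a) through 84(d).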

In view of the many possible embodiments to which the principles of this 
invention may be applied, it should be recognized that the embodiments described herein 
with respect to the drawing figures are meant to be illustrative only and should not be
taken as limiting the scope of invention. For example, those of skill in the art will 
recognize that the elements of the illustrated embodiments shown in software may be 
implemented in hardware and vice versa or that the illustrated embodiments can be 
modified in arrangement and detail without departing from the spirit of the invention. 
Therefore, the invention as described herein contemplates all such embodiments as may
come within the scope of the following claims and equivalents thereof. 
CLAIMS

I claim: 

1. A method for controlling data flow between a sender and a receiver
comprising:

communicating a first credit list to a sender, the first credit list comprising a
plurality of credits indicative of buffer sizes of receive buffers accessible by a receiver
and capable of receiving data from the sender; and

in response to receiving the first credit list, transmitting a data packet, from
the sender to the receiver, the data packet being no greater in size than a first buffer size
specified by a first credit in the first credit list.



2. The method of claim 1 comprising receiving the data packet into a receive 
buffer corresponding to the first credit and having the first buffer size. 

3. The method of claim 1 wherein transmitting the data packet includes 
transferring data from an application-level send buffer to an input/output device without 
copying the data between the application-level receive buffer and the input/output device. 



4. The method of claim 2 comprising after receiving the data packet,
communicating a second credit list to the sender.

5. The method of claim 3 wherein transferring data from the application-level 
receive buffer to the input/output device includes posting a descriptor in a send queue
associated with the input/output device and ringing a doorbell associated with the send 
queue. 



6. The method of claim 1 wherein communicating the first credit list to the
sender includes transmitting a first credit message including the first credit list from the
sender to the receiver.



7. The method of claim 1 wherein communicating the first credit list to the 
sender includes writing the first credit list into a memory buffer shared by the sender and
the receiver. 



8. The method of claim 1 wherein the sender executes on a first computer 
and the receiver executes on a second computer and communicating the first credit list to
the sender includes performing a remote direct memory access write operation from the
second computer to memory of the first computer to write the first credit list to the 
memory of the first computer. 

9. The method of claim 1 comprising establishing at least one first
connection for transmitting data packets between the sender and the receiver and
establishing a second connection between the sender and the receiver for transmitting
credit messages between the sender and the receiver.
10. The method of claim 9 wherein establishing at least one first connection
includes establishing a plurality of first connections for transmitting data packets
between the sender and the receiver. 

11. The method of claim 10 comprising multiplexing credit messages on the
second connection, the credit messages including credits indicating receive buffer sizes
for each of the plurality of first connections. 

12. A credit list builder/communicator comprising computer-executable
instructions embodied in a computer-readable medium for performing steps comprising:

receiving requests for receiving data into a plurality of receive buffers accessible
by a receiver and capable of receiving data from a sender;

in response to the requests, building a credit list including a plurality of credits 
indicative of sizes of the plurality of receive buffers; and 
communicating the credit list to the sender. 

13. The credit list builder/communicator of claim 12 comprising computer- 
executable instructions for determining when to communicate credits to the sender. 

14. The credit list builder/communicator of claim 13 wherein the computer- 
executable instructions for determining when to communicate credits to the sender 
include instructions for monitoring a frequency of credit usage by the sender.

15. The credit list builder/communicator of claim 13 wherein the computer-
executable instructions for determining when to communicate credits to the sender
include instructions for setting a maximum allowable bandwidth for receiving data from
the sender and refraining from communicating additional credits to the sender when the 
additional credits would allow the sender to exceed the maximum allowable bandwidth. 

16. The credit list builder/communicator of claim 12 wherein the computer-
executable instructions for communicating the credit list to the sender include instructions
for transmitting a credit message including the credit list from the receiver to the sender. 

17. The credit list builder/communicator of claim 12 wherein the computer-
executable instructions for communicating the credit list to the sender include instructions
for writing the credit list to a memory buffer shared by the sender and the receiver.

18. The credit list builder/communicator of claim 12 wherein the sender
executes on a first computer and the receiver executes on a second computer and the
computer-executable instructions for communicating the credit list to the sender include
instructions for performing a remote direct memory access from the second computer to
memory of the first computer to write the credit list to the memory of the first computer. 
19. The credit list builder/communicator of claim 12 wherein the computer-
executable instructions for communicating the credits to the sender include instructions
for inserting the credit list in an options field in a TCP packet.



20. A data structure for controlling data flow between a sender and a receiver
comprising:

a credit list including a plurality of credits, each of the credits being indicative
of a buffer size of a receive buffer accessible by a receiver and capable of receiving data 
from a sender. 


21. The data structure of claim 20 wherein the plurality of credits are arranged 
in an order corresponding to an order of posting of descriptors in a receive queue of the 
receiver. 

22. The data structure of claim 20 wherein the credit list is included in a credit
message transmitted from the receiver to the sender through a network.

23. The data structure of claim 20 wherein the credit list is stored in a memory 
buffer shared between the sender and the receiver. 


24. The data structure of claim 20 wherein the credit list is included in a 
remote direct memory access write packet transmitted from the receiver to the sender 
through a network.

25. The data structure of claim 20 wherein the credit list is included in an
options field of a TCP packet transmitted from the receiver to the sender. 

26. A credit list reader/processor comprising computer-executable instructions 
embodied in a computer-readable medium for performing steps comprising: 

posting a first buffer accessible by a sender for receiving credits from a
receiver;

determining whether credits have been received in the first buffer;

in response to receiving credits in the first buffer, posting a second buffer
accessible by the sender for receiving additional credits from the receiver; and

after posting the second buffer, storing credits from the first buffer in a credit
list.

27. The credit list reader/processor of claim 26 wherein the credits received in 
the first buffer are arranged in a first order and the computer-executable instructions for 
storing the credits in the credit list comprise instructions for storing the credits in the first 
order. 

28. The credit list reader/processor of claim 26 comprising computer-
executable instructions for, after storing the credits in the credit list, requesting
transmission of data packets to the receiver, the data packets having sizes controlled by
the credits in the credit list.

29. The credit list reader/processor of claim 28 comprising computer-
executable instructions for removing a credit from the credit list after requesting
transmission of each data packet to the receiver.

30. The credit list reader/processor of claim 29 comprising computer-
executable instructions for, when the credits in the credit list are exhausted, delaying
requesting of transmission of data packets to the receiver until new credits are received
from the receiver. 

31. A credit list builder/communicator comprising computer-executable
instructions embodied in the computer-readable medium for performing steps for
determining when to communicate new credits to a sender comprising:

communicating a first credit list to a sender;

determining if data has been received in a first buffer corresponding to a
credit in the first credit list; and,

in response to determining that data has been received in the first buffer,
communicating a second credit list to the sender.



32. The credit list builder/communicator of claim 31 wherein the first buffer
corresponds to a first credit in the first credit list.

33. The credit list builder/communicator of claim 31 wherein the first buffer
corresponds to a final credit in the first credit list. 

34. The credit list builder/communicator of claim 31 wherein the first buffer
corresponds to a credit between a first credit and a final credit in the first credit list. 

35. A credit list builder/communicator comprising computer-executable
instructions embodied in a computer-readable medium for performing steps for
determining when to communicate new credits to a sender comprising:

communicating a first credit list to a sender;

monitoring a frequency at which the sender consumes credits in the first credit
list; and

determining when to communicate a second credit list to the sender based on the
frequency.

36. The credit list builder/communicator of claim 35 wherein the computer-
executable instructions for determining when to communicate the second credit list to the
sender include instructions for computing a time in time units for communicating the
second credit list to the sender. 
37. The credit list builder/communicator of claim 35 wherein the computer-
executable instructions for determining when to communicate the second credit list to the
sender include instructions for determining a triggering buffer in the first credit list for
triggering communication of the second credit list to the sender.


38. A credit list builder/communicator comprising computer-executable 
instructions embodied in a computer-readable medium for performing steps comprising: 
operating in a first mode for determining when to communicate new credits to
a sender;

receiving in-band information from the sender;

analyzing the in-band information; and 

if the in-band information indicates that switching would increase 
input/output performance, switching to a second mode for determining when to 
communicate new credits to the sender. 






39. The credit list builder/communicator of claim 38 wherein the in-band 
information includes an amount of data remaining to be sent from the sender to the 
receiver. 

40. The credit list builder/communicator of claim 38 comprising computer-
executable instructions for refraining from switching from the first mode to the second
mode if a variance in the rate for receiving data packets from the sender exceeds a first
value, based on the in-band information.

41. An input/output device comprising: 
a processing circuit; 

a memory device coupled to the processing circuit, the memory device
storing a credit list builder/communicator including computer-executable instructions for
performing steps comprising:

receiving requests for receiving data into receive buffers stored at virtual
memory locations of a host computer connectable to the input/output device;

building a credit list including a plurality of credits indicative of sizes of
the receive buffers; and

communicating the credit list to the sender. 

42. An input/output device comprising:

a processing circuit;

a memory device coupled to the processing circuit, the memory device
storing a credit list reader/processor including computer-executable instructions for
performing steps comprising:

posting a first buffer accessible by a sender for receiving credits from a
receiver;

determining whether credits have been received in the first buffer;

in response to receiving credits in the first buffer, posting a second buffer
accessible by the sender for receiving additional credits from the receiver; and

after posting the second buffer, storing credits from the first buffer in a
credit list.

43. The input/output device of claim 42 wherein the credit list 
reader/processor comprises computer-executable instructions for reading the credits in the 
credit list and requesting sending of data packets to the receiver having sizes based on the 
credits. 

44. A network communications system comprising: 

a first local virtual interface for sending data to and receiving data from a 
first remote virtual interface over a first network connection; 

a second local virtual interface for sending credit messages to and 
receiving credit messages from a second remote virtual interface over a second network 
connection; and 

a credit list builder/communicator for building credit messages for 
controlling data flow over the first network connection and communicating the credit 
messages to the second remote virtual interface through the second local virtual interface 
and the second network connection, the credit messages including credit lists including a 
plurality of credits indicative of buffer sizes of receive buffers for receiving data through 
the first local virtual interface from the first remote virtual interface. 






WO 00/41365 PCT/US99/30860 



45. The network communications system of claim 44 comprising a credit 
message reader/processor for reading credit messages received from the second remote 
virtual interface through the second network connection and the second local virtual 

interface and requesting sending of data packets to the first remote virtual interface
through the first local virtual interface, the data packets having sizes based on
credits in the credit messages received from the second remote virtual interface. 

46. The network communications system of claim 44 comprising a plurality of 
first local virtual interfaces for sending data to and receiving from a plurality of first
remote virtual interfaces through a plurality of first network connections.

47. The network communications system of claim 45 comprising a plurality of 
first local virtual interfaces for sending data to and receiving from a plurality of first
remote virtual interfaces through a plurality of first network connections.

48. The network communications system of claim 46 wherein the credit list 
builder/communicator builds credit messages for controlling data flow over the plurality 
of first network connections, the credit messages for controlling data flow over the
plurality of first network connections including credits indicative of buffer sizes for
receiving data through the plurality of first local virtual interfaces. 
49. The network communications system of claim 48 wherein the credit list
builder/communicator multiplexes and sends the credit messages for controlling data
flow over the plurality of first network connections to the second remote virtual interface
through the second local virtual interface and the second network connection, wherein
each of the plurality of credit messages for controlling data flow over the plurality of first
network connections indicates one of the plurality of first remote virtual interfaces to 
which the credits in the credit message pertain. 

50. The network communications system of claim 47 wherein the credit 
message reader/processor receives multiplexed credit messages for controlling data flow
over the plurality of first network connections, demultiplexes the credit messages and 
sends data packets to the plurality of first remote virtual interfaces, the data packets 
having sizes based on credits in the credit messages for controlling data flow over the 
plurality of first network connections. 






51. The network communications system of claim 46 comprising a plurality of 
second virtual interfaces, each of the second virtual interfaces for receiving credit 
messages for controlling data flow over one of the plurality of first network connections. 